Assignment 1
Report for the first assignment of Effective MLOps: Model Development course.
Created on March 24 | Last edited on April 2
Problem and Dataset
- The problem I picked is an ordinal regression task: predicting review ratings for books on Goodreads (a Kaggle competition)
- The data comprises approximately 0.9M book reviews, each containing the book, the author, the review text, and its engagement stats. For a full description, check out the dataset information on Kaggle
EDA Raw Data
- The book_id feature is typed as an ordinal (integer) rather than a categorical
- The date columns (i.e. date_added, date_updated, read_at, started_at) are strings instead of datetime
- The features read_at and started_at have many missing entries
- Rating 4 is the most common, so the naive baseline is to predict rating 4 for every review and compute the F1 score
- People tend to leave review text only when the rating is either very low or very high
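The checks above can be sketched with pandas on a miniature stand-in for the reviews table; the column names and date format here are assumptions based on the Kaggle description, not the actual files.

```python
import pandas as pd

# Hypothetical miniature of the Goodreads reviews table (assumed schema)
df = pd.DataFrame({
    "book_id": [101, 102, 101],
    "date_added": ["2017-07-24 10:00:00", "2017-07-25 11:00:00", None],
    "read_at": [None, None, "2017-07-26 09:00:00"],
    "rating": [4, 4, 1],
    "review_text": ["", "", "awful"],
})

# Date columns arrive as strings, so they show up as object dtype
print(df.dtypes)

# Fraction of missing values per column: read_at is mostly empty
print(df.isna().mean())

# Rating distribution: 4 dominates, which motivates the naive baseline
print(df["rating"].value_counts())
```

Running `value_counts()` on the full dataset is what reveals the class imbalance that the naïve baseline below exploits.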
Data Processing
- Take absolute values of the vote and comment counts (counts cannot be negative)
- Fill missing read_at values with date_added
- Convert date strings to pandas datetime
- Fill in missing values with the mode of the respective column
- Derive additional features (missing_started_at, reading_duration, review_length, spoiler, hour, month, dayofweek, year, ...)
- Drop review text
- Convert id features to category type
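The processing steps above can be sketched as a single pandas function; the column names (n_votes, n_comments, user_id, …) are assumptions about the Kaggle schema, and the mode-filling step is omitted for brevity.

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the preprocessing pipeline described above (assumed columns)."""
    df = df.copy()
    # Counts cannot be negative -> take absolute values
    for col in ("n_votes", "n_comments"):
        df[col] = df[col].abs()
    # Parse date strings into pandas datetimes
    for col in ("date_added", "read_at", "started_at"):
        df[col] = pd.to_datetime(df[col], errors="coerce")
    # Flag missingness before filling, then fill read_at from date_added
    df["missing_started_at"] = df["started_at"].isna()
    df["read_at"] = df["read_at"].fillna(df["date_added"])
    # Derived features
    df["reading_duration"] = (df["read_at"] - df["started_at"]).dt.days
    df["review_length"] = df["review_text"].str.len()
    df["hour"] = df["date_added"].dt.hour
    df["month"] = df["date_added"].dt.month
    df["dayofweek"] = df["date_added"].dt.dayofweek
    df["year"] = df["date_added"].dt.year
    # Drop the raw text and cast id features to category
    df = df.drop(columns=["review_text"])
    for col in ("book_id", "user_id"):
        df[col] = df[col].astype("category")
    return df

raw = pd.DataFrame({
    "book_id": [1, 2],
    "user_id": ["a", "b"],
    "n_votes": [-3, 5],
    "n_comments": [0, -1],
    "date_added": ["2017-07-24 10:00:00", "2017-07-25 11:00:00"],
    "read_at": [None, "2017-07-30 09:00:00"],
    "started_at": [None, "2017-07-26 08:00:00"],
    "review_text": ["great book", ""],
})
clean = preprocess(raw)
print(clean.dtypes)
```

Categorical dtypes matter here because LightGBM can consume pandas categoricals natively instead of treating the ids as ordered numbers.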
Naïve Baseline
- Set all ratings in the validation set to 4
- F1 score: 0.08
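A minimal sketch of this baseline: predict rating 4 for every validation example and score it. The labels here are placeholder data, and macro averaging is an assumption about which F1 variant the report uses.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
y_val = rng.integers(0, 6, size=1000)    # ratings 0..5, placeholder data
y_pred = np.full_like(y_val, 4)          # constant prediction: always 4
baseline_f1 = f1_score(y_val, y_pred, average="macro")
print(f"naive baseline F1: {baseline_f1:.3f}")
```

With macro averaging, a constant predictor earns a nonzero F1 only on its single predicted class, which is why the baseline lands so low.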
Baseline Model
- LightGBM Classifier (default parameters)
- Below are the key metric (F1 score), training charts, predictions, and feature-importance tables.
- Conclusions:
- The F1 score improves drastically compared with the naïve baseline, even though the model was trained with default hyperparameters
- The F1 score and multi-logloss curves had not yet plateaued over the iterations, so increasing the number of iterations would likely improve the model's performance further
- The feature-importance bar plot suggests that the book and user ids are the most important features, implying that certain users and books consistently receive very high or very low ratings
- F1 score: 0.38